Abstract Coronavirus genome replication and transcription take place at cytoplasmic
membranes and involve coordinated processes of both continuous and discontinu- 
ous RNA synthesis that are mediated by the viral replicase, a huge protein complex 
encoded by the 20-kb replicase gene. The replicase complex is believed to be com- 
prised of up to 16 viral subunits and a number of cellular proteins. Besides RNA-de- 
pendent RNA polymerase, RNA helicase, and protease activities, which are common 
to RNA viruses, the coronavirus replicase was recently predicted to employ a variety 
of RNA processing enzymes that are not (or extremely rarely) found in other RNA 
viruses and include putative sequence-specific endoribonuclease, 3’-to-5’ exoribonu- 
clease, 2’-O-ribose methyltransferase, ADP ribose 1”-phosphatase and, in a subset of 
group 2 coronaviruses, cyclic phosphodiesterase activities. This chapter reviews (1) 
the organization of the coronavirus replicase gene, (2) the proteolytic processing of 
the replicase by viral proteases, (3) the available functional and structural informa- 
tion on individual subunits of the replicase, such as proteases, RNA helicase, and the 
RNA-dependent RNA polymerase, and (4) the subcellular localization of coronavirus 
proteins involved in RNA synthesis. Although many molecular details of the corona- 
virus life cycle remain to be investigated, the available information suggests that 
these viruses and their distant nidovirus relatives employ a unique collection of en- 
zymatic activities and other protein functions to synthesize a set of 5’-leader-con- 
taining subgenomic mRNAs and to replicate the largest RNA virus genomes current- 
ly known. 
1
Introduction 


Plus-strand (+) RNA viruses exhibit an enormous genetic diversity that 
also applies to their RNA synthesis machinery. The RNA-depen- 
dent RNA polymerase (RdRp) is the only enzyme to be absolutely con- 
served, whereas other replicative and accessory protein domains vary 
considerably, in terms of both number and arrangement in the polypro- 
tein (Koonin and Dolja 1993). Despite this diversity, phylogenetic rela- 
tionships have been identified and used to group +RNA viruses into 
large superfamilies (or classes) (Goldbach 1987; Strauss and Strauss 
1988; Koonin and Dolja 1993). As few as three superfamilies, the pico- 
rnavirus-like, flavivirus-like and alphavirus-like viruses, were proposed 
to accommodate the vast majority of +RNA viruses infecting animals, 
plants, and microorganisms (Koonin and Dolja 1993). Interestingly, 
coronaviruses were among the few exceptions that did not easily fit into 
one of the established superfamilies; and the sequence analysis and 
characterization of arteri-, toro-, and roniviruses suggested that coron- 
aviruses and their relatives may indeed exemplify a viral life form that, 
in several fundamental aspects, differs from that of other +RNA viruses 
(Gorbalenya et al. 1989c; Snijder et al. 1990a; den Boon et al. 1991; Sni- 
jder and Horzinek 1993; de Vries et al. 1997; Lai and Cavanagh 1997; 
Snijder and Meulenberg 1998; Cowley et al. 2000). Thus coronaviruses 
(and all their relatives) (1) produce a nested set of 3’-coterminal mRNAs 
(Lai et al. 1983; Spaan et al. 1983), (2) use ribosomal frameshifting into 
the -1 frame to express their key replicative functions (Brierley et al. 
1987, 1989), (3) have a unique set of conserved functional domains that 
are arranged in the viral polyproteins in the following order: chymo- 
trypsin-like proteinase, RdRp, helicase, and endoribonuclease (from 
N- to C-terminus) (Gorbalenya et al. 1989c; Gorbalenya 2001; Snijder et 
al. 2003), and (4) use RdRp and helicase activities that, based on the 
conservation of signature motifs, have been classified as belonging to 
the RdRp and helicase superfamilies 1, respectively (Koonin and Dolja 
1993). Both the combination of two superfamily 1 domains and their
sequential order in the polyprotein, with RdRp preceding the helicase, are
extremely unusual (if not unique) among +RNA viruses. On the basis of
these and other common properties, a new virus order, the Nidovirales, 
was introduced several years ago (Cavanagh 1997). At present, there is
only limited information on the toro- and ronivirus replicases, whereas
information on the replicases of corona- and arteriviruses is accumulating
rapidly. On the basis of both serological relationships and sequence
similarity, coronaviruses have been classified into three groups (Siddell
1995), with human coronavirus 229E (HCoV-229E, group 1), porcine 
transmissible gastroenteritis virus (TGEV, group 1), mouse hepatitis vi- 
rus (MHV, group 2), and avian infectious bronchitis virus (IBV, group 3) 
being the best-studied coronaviruses to date. Because of its medical im- 
portance, SARS coronavirus (SARS-CoV) (tentatively classified as be- 
longing to group 2) (Snijder et al. 2003) is currently becoming a major 
topic of coronavirus research. 


2 
Organization and Expression of the Replicase Gene 


Complete genome sequences are currently available for seven species of 
coronaviruses, IBV (Boursnell et al. 1987), MHV (Bredenbeek et al. 1990; 
Lee et al. 1991; Bonilla et al. 1994), HCoV-229E (Herold et al. 1993), 
TGEV (Eleouet et al. 1995; Penzes et al. 2001), porcine epidemic diarrhea 
virus (PEDV) (Kocherhans et al. 2001), bovine coronavirus (Chouljenko 
et al. 2001), and SARS-CoV (Marra et al. 2003; Rota et al. 2003). In some 
cases (for example, SARS-CoV) complete genome sequences are available
for multiple isolates (Ruan et al. 2003). The genome
sizes of coronaviruses range between 27.3 (HCoV-229E) and 31.3 
(MHV) kb, making coronaviruses the largest RNA viruses currently 
known. About two-thirds of the coronavirus genome (~20,000 bases) are 
devoted to encoding the viral replicase that mediates viral RNA synthe- 
sis (Thiel et al. 2001b) and, possibly, other functions. The replicase gene 
is comprised of two large open reading frames, designated ORF1a and
ORF1b, that are located at the 5’ end of the genome. The upstream
ORF1a encodes a polyprotein of 450-500 kDa, termed polyprotein
(pp)1a, whereas ORF1a and ORF1b together encode pp1ab (750-800 kDa)
(Fig. 1). Expression of the C-terminal, ORF1b-encoded half of
pp1ab requires a (-1) ribosomal frameshift during translation. It is
generally accepted that frameshifting depends on two critical elements, the
“slippery” sequence, UUUAAAC, at which the ribosome shifts into the
(-1) reading frame and a tripartite RNA pseudoknot structure located
further downstream, near the ORF1a/1b junction (Brierley et al. 1987,
1989; Herold and Siddell 1993). In vitro experiments using reticulocyte 
lysates indicate that frameshifting occurs in about 20%-30% of the 
translation events, but it is not known whether this reflects the situation 
in vivo. The fact that the core replicative functions, RdRp and helicase, 
are encoded by ORF1b implies that their expression critically depends 
on ribosomal frameshifting, suggesting a requirement for a specific
molar ratio between ORF1a- and ORF1b-encoded protein functions.

Fig. 1. Overview of the domain organization and proteolytic processing of
coronavirus replicase polyproteins. Shown are the replicase polyproteins pp1a
and pp1ab of human coronavirus 229E (HCoV-229E), mouse hepatitis virus (MHV),
SARS coronavirus (SARS-CoV), and avian infectious bronchitis virus (IBV). The
processing end products of pp1a are designated nonstructural proteins (nsp) 1
to nsp11, and those of pp1ab are designated nsp1 to nsp10 and nsp12 to nsp16.
Note that nsp1 to nsp10 may be released by proteolytic processing of either
pp1a or pp1ab, whereas nsp11 is processed from pp1a and nsp12 to nsp16 are
processed from pp1ab. nsp11 and nsp12 share a number of residues at the
N-terminus. Alternative names that have been used in the past to designate
specific processing products are given. Cleavage sites that are processed by
the viral main proteinase are indicated by red arrowheads, and sites that are
processed by the accessory papainlike proteinases 1 and 2 are indicated by
orange and blue arrowheads, respectively. Ac, acidic domain (Ziebuhr et al.
2001); PL1, accessory papainlike cysteine proteinase 1 (Baker et al. 1989,
1993; Gorbalenya et al. 1991; Herold et al. 1998); X, X domain (Gorbalenya et
al. 1991), which is predicted to have adenosine diphosphate-ribose
1”-phosphatase activity (Snijder et al. 2003); SUD, SARS-CoV unique domain
(Snijder et al. 2003); PL2, accessory papainlike cysteine proteinase 2
(Gorbalenya et al. 1991; Liu et al. 1995; Kanjanahaluethai and Baker 2000;
Ziebuhr et al. 2001); Y, Y domain containing a transmembrane domain and a
putative Cys/His-rich metal-binding domain; TM1, TM2, and TM3, putative
transmembrane domains 1 to 3; 3CL, 3C-like main proteinase (Gorbalenya et al.
1989c; Liu and Brown 1995; Ziebuhr et al. 1995; Lu et al. 1995); RdRp,
putative RNA-dependent RNA polymerase domain (Gorbalenya et al. 1989c); HEL,
helicase domain (Seybert et al. 2000a); ExoN, putative 3’-to-5’ exonuclease
(Snijder et al. 2003); XendoU, putative poly(U)-specific endoribonuclease
(Snijder et al. 2003); MT, putative S-adenosylmethionine-dependent ribose
2’-O-methyltransferase (Snijder et al. 2003); C/H, Cys/His-rich domains
predicted to bind metal ions. Note that IBV pp1a and pp1ab do not have a
counterpart of nsp1 of other coronaviruses. The papainlike cysteine
proteinase 1 of IBV is crossed out to indicate that the domain is
proteolytically inactive.


3
Replicase Polyproteins


3.1
Functional Domains


Initial sequence analyses in the late 1980s suggested a large divergence
of the coronavirus replicase from the replicative machinery of other
+RNA viruses. Accordingly, at this time, only very few functional
predictions could be made for the ~800-kDa replicative polyproteins of
coronaviruses (Boursnell et al. 1987). In 1989, a detailed comparative sequence
analysis of the IBV replicase gene (Gorbalenya et al. 1989c) was
published in which the RdRp and NTPase/helicase domains were predicted
to be encoded by the 5’ region of ORF1b. Furthermore, a putative
chymotrypsin-like (picornavirus 3C-like) cysteine proteinase domain
(3CLpro) was identified in ORF1a and predictions on putative cleavage
sites in the C-terminal regions of pp1a and pp1ab were made. The
proteinase was found to be flanked by membrane domains on both sides.
The coronavirus replicative proteins were proposed to be only extremely 
distantly related to the corresponding homologs of other +RNA viruses, 
and many of the pp1a/pp1ab-encoded enzymes appeared to have unique
structural properties. Thus, for example, the helicase was proposed to be 
linked at its N-terminus to a complex zinc-binding domain (ZBD) con- 
sisting of 12 Cys/His residues (see below). In several cases, mutations in 
otherwise strictly conserved signature sequences were found. Thus the 
typical G-D-D signature of the conserved RdRp motif VI (Koonin 1991) 
was found to be replaced by S-D-D in the coronavirus homolog and the 
G(A)-X-H motif conserved in the S1 subsite of the substrate-binding 
pocket of picornavirus 3C proteinases (Gorbalenya et al. 1989a, 1989c) 
was substituted with Y-M-H. The predictions on functional domains, 
putative active-site residues, and proteinase cleavage sites were continu- 
ously elaborated and extended when more coronavirus replicase se- 
quences became available (Gorbalenya et al. 1991; Lee et al. 1991; Herold 
et al. 1993; Eleouet et al. 1995; Chouljenko et al. 2001; Kocherhans et al. 
2001; Penzes et al. 2001; Ziebuhr et al. 2001; Snijder et al. 2003). In these 
studies, papainlike cysteine proteinase (PLpro) domains (Gorbalenya et
al. 1991), a conserved domain of corona-, alpha-, and rubiviruses,
termed X* (Gorbalenya et al. 1991), an acidic domain (Ac) of unknown
function, and a domain (termed Y) with putative metal-binding and
membrane-targeting functions (Ziebuhr et al. 2001) were identified in
the coronavirus ORF1a sequence (Fig. 1). Overall, the sequence
similarities between the replicase genes of prototypic viruses from the
three coronavirus groups corresponded well to those of the structural
protein regions, providing support for the traditional classification of
coronaviruses into three groups, which previously was based on structural
protein sequence relationships and serological cross-reactivities (Siddell 
1995). 

* The X domain has recently been predicted to be an adenosine
diphosphate-ribose 1”-phosphatase (ADRP).

Recently, the list of putative enzymes involved in coronavirus RNA
synthesis was extended considerably. Thus, in the context of a
bioinformatics study of the SARS-CoV genome, as many as five (putative)
coronaviral RNA processing activities were identified (Snijder et al. 2003)
(Fig. 1). These include (1) a 3’-to-5’ exonuclease (ExoN) of the DEDD 
superfamily (Zuo and Deutscher 2001), (2) a poly(U)-specific endoribo- 
nuclease (XendoU) (Laneve et al. 2003), (3) an S-adenosylmethionine- 
dependent ribose 2’-O-methyltransferase (2’-O-MT) of the RrmJ family 
(Bügl et al. 2000), (4) an ADRP (Martzen et al. 1999), and (5) a cyclic
phosphodiesterase (CPD) (Martzen et al. 1999; Nasr and Filipowicz 
2000). Four of the activities are conserved in all coronaviruses, indicat- 
ing their essential role in the coronaviral life cycle. In fact, the number 
of enzymes predicted to be involved in coronavirus RNA synthesis and 
modification is unique in RNA viruses and indicates a remarkable func- 
tional complexity, which approaches that of DNA replication. Three 
of the newly identified activities, ExoN (nsp14), XendoU (nsp15), and 
2’-O-MT (nsp16), are arranged in pp1ab as a single protein block
downstream of the RdRp (nsp12) and helicase (nsp13) domains (Fig. 1),
suggesting that their activities cooperate in the same metabolic pathway(s).
This conclusion is supported by the identification of a stable processing 
intermediate in IBV-infected cells that exactly comprises these three 
domains (Xu et al. 2001). It is also supported by the fact that nsp14-16 
expression involves common regulatory mechanisms: (1) ribosomal
frameshifting and (2) 3CLpro-mediated proteolysis. As a first clue to
possible functions encoded by this gene block in ORF1b, an exciting parallel
to cellular RNA processing pathways was found by Snijder et al. (2003). 
Thus homologs of the coronavirus nsp14-16 processing products cleave 
and process mRNAs to produce small nucleolar (sno) RNAs that, in turn, 
guide specific 2’-O-ribose methylations of rRNA (Kiss 2001; Filipowicz 
and Pogacic 2002). 
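
The (-1) ribosomal frameshift that gives access to ORF1b, and hence to
nsp12-16, can be illustrated with a minimal Python sketch. It assumes a
simultaneous-slippage model in which the ribosome decodes the slippery
heptamer UUUAAAC in the ORF1a frame and then reads the last nucleotide of the
heptamer a second time as the first base of the next codon. The toy RNA, the
function names, and the omission of the downstream pseudoknot are illustrative
assumptions and are not taken from the studies cited above.

```python
from itertools import product

# Standard genetic code, codons enumerated in U, C, A, G order.
BASES = "UCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

SLIPPERY = "UUUAAAC"  # slippery sequence named in the text


def translate(rna, start=0):
    """Translate from `start` until the first stop codon or the end of `rna`."""
    protein = []
    for i in range(start, len(rna) - 2, 3):
        aa = CODON_TABLE[rna[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


def frameshift_product(rna):
    """Return the protein made when the ribosome slips into the -1 frame."""
    site = rna.find(SLIPPERY)
    # The heptamer must be read as U UUA AAC in the 0 frame.
    if site < 0 or site % 3 != 2:
        raise ValueError("no in-frame slippery sequence found")
    end = site + len(SLIPPERY)       # first base after the heptamer
    upstream = translate(rna[:end])  # 0-frame codons up to and including AAC
    if 3 * len(upstream) != end:
        raise ValueError("stop codon upstream of the slippery sequence")
    # After the slip, the last base of the heptamer is read a second time.
    return upstream + translate(rna, start=end - 1)


# Hypothetical toy RNA: AUG GCU UUA AAC GGG UAA ... in the 0 frame.
toy_rna = "AUGGCUUUAAACGGGUAAGCCUGUAA"
print(translate(toy_rna))           # product without frameshifting
print(frameshift_product(toy_rna))  # fusion product after the -1 shift
```

In this toy case the unshifted ribosome stops shortly after the heptamer,
whereas the shifted ribosome bypasses that stop codon and extends the
polypeptide, which mirrors the relationship between pp1a and pp1ab.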

Two other coronavirus domains, CPD and ADRP, both of which do 
not require ribosomal frameshifting for expression, were speculated to 
cooperate in a pathway that again has parallels in the cell. Thus two cel- 
lular homologs are known to mediate two consecutive steps in the down- 
stream processing of tRNA splicing products. In this pathway, CPD con- 
verts adenosine diphosphate ribose 1”-2” cyclic phosphate (Appr>p) to 
adenosine diphosphate ribose 1”-phosphate (Appr-1”-p) (Culver et al. 
1994) that, in a second reaction, is further processed (probably dephos- 
phorylated) by an ADRP homolog (Martzen et al. 1999). 

Obviously, the characterization of the substrate specificities of the 
newly identified enzymes will now be of major interest and may allow 
predictions or even conclusions on the functions of these proteins. Both 
(reverse) genetic and biochemical data will be required to answer the 
question of whether the RNA processing enzymes are directly involved 
in the synthesis and/or processing of viral RNA or rather interfere with
(and thereby reprogram) cellular pathways for the benefit of viral repli- 
cation (or even have other functions). 

The observed pattern of conservation in different nidovirus families 
suggests a functional hierarchy for the five RNA processing activities, 
with XendoU playing a central role. This enzyme is universally con- 
served in nidoviruses and was previously referred to as “nidovirus-spe- 
cific conserved domain” (Snijder et al. 1990b; den Boon et al. 1991; de 
Vries et al. 1997). In contrast, CPD is only encoded by toroviruses and a 
subset of group 2 coronaviruses (excluding SARS-CoV) (Snijder et al. 
2003). Given that coronaviruses and arteriviruses are generally believed 
to use very similar replication and transcription strategies, it is intrigu- 
ing that, out of the four activities conserved in all coronaviruses (ExoN, 
XendoU, 2’-O-MT, and ADRP), only one activity (XendoU) is conserved 
in arteriviruses. One may therefore speculate that (1) arterivirus and co- 
ronavirus RNA synthesis mechanisms differ in several molecular details 
or (2) the viruses interact differentially with RNA processing pathways 
of the host cell. Alternatively, the extra functions encoded by corona- 
and toroviruses (and, to a lesser extent, roniviruses) may be required to 
synthesize and maintain the extremely large (~30 kb) RNA genomes of 
these viruses. Thus, on the basis of its sequence similarity with cellular 
3’-to-5’ exonucleases involved in proofreading, repair, and/or recombi- 
nation, ExoN has been speculated to be involved in related mechanisms 
that may be required for the life cycle of corona-, toro-, and roniviruses 
but may be dispensable for the much smaller arteriviruses (Snijder et al. 
2003). The significance of the observation that overexpression of nsp14 
induces apoptotic changes in the host cell (Liu et al. 2001) remains to be 
further investigated. 


3.2 
Proteolytic Processing by Viral Cysteine Proteinases 


In common with many other +RNA viruses (Krausslich and Wimmer 
1988; Dougherty and Semler 1993), coronaviruses employ proteolytic 
processing as a key regulatory mechanism in the expression of their 
replicative protein functions (Ziebuhr et al. 2000). Proteinase inhibitors 
that block proteolytic processing also inhibit coronavirus replication,
illustrating the essential role of pp1a/pp1ab processing for viral RNA
synthesis (Kim et al. 1995). On the basis of their physiological role,
coronavirus proteinases can be classified into accessory proteinases, which are
responsible for cleaving the more divergent N-proximal pp1a/pp1ab
regions at two or three sites, and main proteinases, which cleave the major
part of the polyproteins at 11 conserved sites and also release the
conserved key replicative functions, such as RdRp, helicase, and three of the
RNA processing domains (Ziebuhr et al. 2000; Snijder et al. 2003). All 
coronaviruses encode one main proteinase and, depending on the virus 
(see below and Fig. 1), one or two accessory proteinases. The accessory 
proteinases are papainlike cysteine proteinases that are designated PLpro
(PL1pro and PL2pro). The main proteinase is a cysteine proteinase with a
serine proteinase-like structure (Anand et al. 2002). In previous
publications, two alternative designations have been used for this protein. The
name main proteinase, Mpro, is generally used to stress the dominant
physiological role of this proteinase in coronavirus gene expression,
whereas the name 3C-like proteinase is used to stress the (distant)
relationship with picornavirus 3C proteinases, which is based on a common
chymotrypsin-like two-β-barrel structure and similar substrate
specificities (Gorbalenya et al. 1989a,c; Ziebuhr et al. 2000). Despite this
relationship, there are also important structural differences between
picornavirus and coronavirus chymotrypsin-like proteinases (see below).

Peptide cleavage data obtained for several coronavirus main pro- 
teinases revealed differential processing kinetics for specific sites. The 
order of cleavages was found to be conserved among coronaviruses and 
appears to depend on the accessibility of specific sites in the context of 
the polyprotein (Piñón et al. 1999) as well as the primary and secondary
structures of a given cleavage site. Thus deviation from the 3CLpro
cleavage site consensus sequence, L-Q|(A,S,G), resulted in most cases in
significantly reduced cleavage efficiencies (Ziebuhr and Siddell 1999; Hegyi
and Ziebuhr 2002; Fan et al. 2003). Furthermore, substrate peptides
adopting extended β-strand structures appear to be favored by 3CLpro
over α-helical or disordered structures (Fan et al. 2003). On the basis of
these data, it is reasonable to postulate that coronavirus polyprotein 
processing occurs in a temporally coordinated manner, which might 
lead to activation and inactivation of specific functions in the course of 
the viral life cycle, as has been demonstrated for other +RNA viruses 
(Lemm et al. 1994; Vasiljeva et al. 2003). 
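
The consensus L-Q|(A,S,G) can be turned into a simple scan for candidate
3CLpro cleavage sites. The Python sketch below is only an illustration: the
toy sequence and the function names are hypothetical, and a match to the
consensus is treated as a candidate only, because site accessibility and
secondary structure also influence cleavage, as discussed above.

```python
import re

# Consensus from the text: L-Q | (A, S, G); "|" marks the scissile bond.
# The lookahead keeps the P1' residue out of the match itself.
CONSENSUS = re.compile(r"LQ(?=[ASG])")


def candidate_sites(polyprotein):
    """Return the 0-based index of the P1' residue for each consensus match."""
    return [m.end() for m in CONSENSUS.finditer(polyprotein)]


def digest(polyprotein):
    """Split the sequence at every candidate site (complete digestion assumed)."""
    cuts = [0] + candidate_sites(polyprotein) + [len(polyprotein)]
    return [polyprotein[a:b] for a, b in zip(cuts, cuts[1:])]


# Hypothetical toy sequence, not a real replicase fragment.
toy = "MKTLQSGAVLQAPRLQKT"
print(candidate_sites(toy))
print(digest(toy))
```

The third L-Q pair in the toy sequence is followed by Lys and is therefore
not reported, which corresponds to the reduced cleavage efficiency observed
for sites that deviate from the consensus.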

The combined data of numerous studies published in the past 15 
years provide a (nearly) complete picture of the pp1a/pp1ab processing
pathways of prototypic viruses from all three coronavirus groups
(Fig. 1). Throughout this chapter, the replicase processing end products
will be consecutively numbered from nonstructural protein (nsp) 1 to
nsp16 (from N- to C-terminus*) to facilitate their comparison with
homologs from other coronaviruses.


3.2.1 
Accessory Proteinases 


The N-proximal regions of the MHV and HCoV-229E replicase polyproteins
are processed by two PLpros at three sites to produce nsp1-4, with
the C-terminus of nsp4 being cleaved by the main proteinase (Fig. 1).
The proteolytic activities of the MHV and HCoV-229E PL1pro and PL2pro
domains and the IBV PL2pro, which all reside in nsp3, have been
characterized in detail (Ziebuhr et al. 2000). Briefly, the MHV PL1pro cleaves
the nsp1|nsp2 and nsp2|nsp3 sites, while PL2pro processes the third site,
nsp3|nsp4 (Baker et al. 1989, 1993; Dong and Baker 1994; Denison et al.
1995; Hughes et al. 1995; Bonilla et al. 1997; Teng et al. 1999;
Kanjanahaluethai and Baker 2000; Kanjanahaluethai et al. 2003). Also in
HCoV-229E, PL1pro was shown to cleave the nsp1|nsp2 and nsp2|nsp3 sites
(Herold et al. 1998; Ziebuhr et al. 2001). However, in the case of
HCoV-229E, the regulation of proteolytic processing was shown to be more
complex than previously thought. Thus PL2pro (originally believed to
process only the nsp3|nsp4 site) was demonstrated also to process the
nsp2|nsp3 site. The nsp2|nsp3 cleavages mediated by PL1pro and PL2pro,
respectively, were shown to occur at exactly the same scissile bond
(Herold et al. 1998; Ziebuhr et al. 2001). Whereas the PL1pro-mediated
cleavage proved to be slow and incomplete in vitro, PL2pro cleaved this
site efficiently under the same experimental conditions. Furthermore,
evidence was obtained to suggest that the proteolytic activity of PL1pro
at the nsp2|nsp3 site is downregulated by PL2pro by a noncompetitive
mechanism (Ziebuhr et al. 2001). It was concluded that the activities
of the two proteinase domains present in nsp3 are tightly regulated
in HCoV-229E and, probably, also other coronaviruses, with PL2pro
playing a major role and dominating over the activity of PL1pro. This
conclusion is also supported by the conservation of PL2pro in all coronaviruses
(Ziebuhr et al. 2001; Snijder et al. 2003).

* Note that similar designations (nsp or ns) are occasionally used for some of the
group-specific nonstructural proteins encoded in the 3’-structural protein regions of
coronaviruses (Brown and Brierley, 1995).

IBV encodes only one proteolytically active PLpro, which is PL2pro.
The IBV PL1pro domain, although being conserved, has lost its
proteolytic activity in the course of evolution because of the accumulation of
active site mutations (Ziebuhr et al. 2001). Apparently, IBV does not
encode a counterpart of the nsp1 protein of other coronaviruses. Thus
there are only two cleavage sites in this region of pp1a/pp1ab, nsp2|nsp3
and nsp3|nsp4, which are both processed by PL2pro (Lim and Liu 1998;
Lim et al. 2000). In SARS-CoV, only one PLpro is conserved (Marra et al.
2003; Rota et al. 2003). The domain occupies a position in pp1a/pp1ab
that corresponds to that of the PL2pro domains of other coronaviruses
and therefore is considered an ortholog of coronavirus PL2pros (Snijder
et al. 2003). Obviously, the SARS-CoV PL2pro must be responsible for
the processing of all three sites identified in this region and, indeed, the
activity of PL2pro at the nsp2|nsp3 site was demonstrated recently (Thiel
et al. 2003). The arrangement of the N-terminal domains of SARS-CoV
nsp3 differs from that of other coronaviruses (Ziebuhr et al. 2001;
Snijder et al. 2003). Thus, the conserved ADRP domain (“X” in Fig. 1)
resides immediately downstream of the acidic domain (Ac) in nsp3, a
position that is occupied by PL1pro in other coronaviruses. Further
downstream, another domain of unknown function has been identified
in the region separating the ADRP and PL2pro domains. It has been
termed “SARS-CoV unique domain” (SUD) (Snijder et al. 2003) (Fig. 1).
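
The assignments of accessory proteinases to the N-proximal cleavage sites
described above can be summarized in a small lookup table. The Python sketch
below merely restates the text; the dictionary layout and the function names
are illustrative assumptions, and the SARS-CoV entries for nsp1|nsp2 and
nsp3|nsp4 are inferred from the presence of a single PLpro rather than
demonstrated experimentally.

```python
# Cleavage site -> proteinase(s) reported or inferred to process it.
PL_SITES = {
    "MHV": {
        "nsp1|nsp2": {"PL1pro"},
        "nsp2|nsp3": {"PL1pro"},
        "nsp3|nsp4": {"PL2pro"},
    },
    "HCoV-229E": {
        "nsp1|nsp2": {"PL1pro"},
        "nsp2|nsp3": {"PL1pro", "PL2pro"},  # same scissile bond for both
        "nsp3|nsp4": {"PL2pro"},
    },
    "IBV": {  # no nsp1 counterpart; PL1pro is proteolytically inactive
        "nsp2|nsp3": {"PL2pro"},
        "nsp3|nsp4": {"PL2pro"},
    },
    "SARS-CoV": {  # only nsp2|nsp3 demonstrated; the others are inferred
        "nsp1|nsp2": {"PL2pro"},
        "nsp2|nsp3": {"PL2pro"},
        "nsp3|nsp4": {"PL2pro"},
    },
}


def redundant_sites(virus):
    """Sites that more than one accessory proteinase can process."""
    return sorted(s for s, enz in PL_SITES[virus].items() if len(enz) > 1)


def active_proteinases(virus):
    """Accessory proteinases with at least one assigned site."""
    return sorted(set().union(*PL_SITES[virus].values()))


for virus in PL_SITES:
    print(virus, active_proteinases(virus), redundant_sites(virus))
```

Laid out this way, HCoV-229E is the only virus in the table with a site
assigned to both proteinases, and PL2pro is the only activity present in all
four viruses.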

The sequence similarity between coronaviral PLpros and the prototypic
cellular proteinases is very low. A closer relationship seems to
exist between the active sites of coronavirus PLpros and the leader
proteinase (Lpro) of the picornavirus foot-and-mouth disease virus (FMDV)
(Gorbalenya et al. 1991). Crystal structure analysis revealed that the
active site of Lpro also diverged profoundly from its cellular homologs,
which explains some of the unique biochemical properties of this
enzyme, such as salt sensitivity and narrow pH optimum (Guarné et al.
1998, 2000). It remains to be studied whether the sequence affinity
between Lpro and coronavirus PLpros is associated with common structural
and functional features.

Only very few amino acids are absolutely conserved among coronavirus
PLpros (Herold et al. 1999). Furthermore, there are only very few
PL1pro versus PL2pro lineage-specific residues, which do not provide
sufficient evidence for clustering the PL1pro and PL2pro domains into two
separate groups. Despite this divergence at the sequence level,
coronavirus PLpros share a number of common properties. Thus they all (1)
process sites that are located in the N-terminal half of the replicase
polyproteins, far upstream of the conserved ORF1b-encoded domains (Fig. 1),
(2) cleave sites that have at least one small residue (Gly, Ala) at the
scissile bond (Dong and Baker 1994; Hughes et al. 1995; Bonilla et al. 1997;
Herold et al. 1998; Lim and Liu 1998; Lim et al. 2000; Ziebuhr et al. 2001;
Kanjanahaluethai et al. 2003), (3) have a catalytic dyad consisting of Cys
(followed by Trp or Tyr) and a downstream His (Baker et al. 1993;
Herold et al. 1998; Lim and Liu 1998), and (4) employ variants of the
papainlike α+β fold (Gorbalenya et al. 1991; Herold et al. 1999). Molecular
modeling suggests that the α and β domains are connected by a
transcription factor-like domain that includes a zinc-binding domain (ZBD)
essential for proteolytic activity (Herold et al. 1999) (Fig. 1). It seems
likely that the domain also has other functions, for example, in subgenomic
(sg) mRNA transcription. This hypothesis is based on (1) the sequence
similarity with cellular transcription factors (Herold et al. 1999) and (2) the
fact that the related ZBD-containing nsp1 papainlike proteinase of equine
arteritis virus (EAV) has a clearly established role in arterivirus sg mRNA
synthesis (Tijms et al. 2001).

The presence of two PLpros in most coronavirus replicases suggests
that these enzymes originated from the duplication of a PLpro domain in
one of the ancestors of the contemporary coronaviruses. Surprisingly,
however, phylogenetic trees inferred from multiple sequence
comparisons of coronavirus PLpros revealed that only the PL1pro and PL2pro
domains of the most closely related coronaviruses were clustered together
(Ziebuhr et al. 2001). Therefore, multiple independent gene duplications
in different coronaviruses cannot be excluded entirely. Alternatively and
much more probably, the above result can be interpreted to reflect
homoplasy events that, subsequent to the initial gene duplication, have
driven a parallel evolution of the two coronavirus PLpro paralogs, while
other regions of the replicase diverged much more profoundly (Ziebuhr
et al. 2001). Often, such homoplasy events are driven by common
substrates. Thus the identification of a common cleavage site that is
processed by both PL1pro and PL2pro in HCoV-229E may indicate that, in
this virus and probably also other coronaviruses, the conservation of
overlapping substrate specificities was an important driving force of
evolution. The underlying selective advantage that led to the
conservation of such a partial redundancy of two proteinase domains in most
coronaviruses remains to be investigated. Conservation of overlapping
substrate specificities also appears to affect the cleavage site structures.
Thus a comparison of PLpro cleavage sites of SARS-CoV and IBV, which
both employ only one PLpro activity, with the corresponding cleavage
sites of HCoV-229E, which employs two PLpro domains, revealed a much
better conservation of the IBV/SARS-CoV PL2pro sites compared with
the HCoV PL1pro/PL2pro sites (Thiel et al. 2003).


3.2.2
Main Proteinase 


The coronavirus main proteinase, 3CLpro, is encoded by ORF1a and
resides in nsp5 (Fig. 1). In the polyprotein, it is flanked by
hydrophobic domains. The ~33-kDa proteinase releases itself from pp1a/pp1ab
at flanking sites and directs the proteolytic processing of all
downstream domains of pp1a/pp1ab (Fig. 1). In total, 3CLpro cleaves at 11
conserved sites to produce 13 processing end products and, probably,
multiple intermediates. Because of the central role in the expression of
the major replicative proteins, 3CLpro is also called “main” proteinase
(Mpro).
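
The figures of 11 cleavage sites and 13 end products can be reproduced from
the nsp numbering with a short count. The Python sketch below is one possible
reading, under two stated assumptions: that nsp4, whose C-terminus is cut by
the main proteinase, is counted among the products, and that the nsp10|nsp11
and nsp10|nsp12 junctions are the same scissile bond because nsp11 and nsp12
share their N-terminal residues. All names in the code are hypothetical.

```python
# nsp numbering as in Fig. 1: pp1a yields nsp1-nsp11,
# pp1ab yields nsp1-nsp10 and nsp12-nsp16.
PP1A = list(range(1, 12))
PP1AB = list(range(1, 11)) + list(range(12, 17))
FIRST_3CL_NSP = 4  # nsp4|nsp5 is the most upstream 3CLpro site


def junctions_3cl(nsps):
    """Adjacent (upstream, downstream) pairs from nsp4|nsp5 onward."""
    return [(a, b) for a, b in zip(nsps, nsps[1:]) if a >= FIRST_3CL_NSP]


all_junctions = junctions_3cl(PP1A) + junctions_3cl(PP1AB)

# Assumption: nsp10|nsp11 and nsp10|nsp12 share one scissile bond.
unique_sites = {(a, 12 if b == 11 else b) for a, b in all_junctions}

# Products with at least one terminus generated by 3CLpro.
products = {nsp for pair in all_junctions for nsp in pair}

print(len(unique_sites), "sites;", len(products), "products")
print(sorted(products))
```

Under these assumptions the count covers nsp4 to nsp16, with nsp11 arising
only from pp1a and nsp12 to nsp16 only from pp1ab.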

Coronavirus 3CLpros represent a highly diverged branch of two-β-barrel
proteinases (Gorbalenya et al. 1989a,c). In contrast to what the name
suggests, coronavirus 3CLpros also deviate significantly from the
picornavirus 3C and other +RNA viral 3C-like proteinases. Characterization
of a roniviral 3CLpro has indicated that the 3C-like proteinases of
potyviruses may represent the closest relatives of coronavirus 3CLpros
(outside the Nidovirales order) (Cowley et al. 2000; Gorbalenya 2001;
Ziebuhr et al. 2003). In common with the prototypic picornavirus 3C
proteinases (Allaire et al. 1994; Matthews et al. 1994; Mosimann et al.
1997), coronavirus 3C-like proteinases have a chymotrypsin-like,
two-β-barrel fold that is formed by 12 antiparallel β-strands (Allaire et al.
1994; Matthews et al. 1994; Mosimann et al. 1997; Anand et al. 2002, 2003).
However, both the size and orientation of secondary structure elements
vary considerably between the two groups of enzymes, making reliable
structural alignments difficult, if not impossible. Furthermore, in
contrast to 3C proteinases but in common with other nidovirus 3C-like
proteinases (Barrette-Ng et al. 2002; Ziebuhr et al. 2003), coronavirus
3CLpros have a C-terminal extension, which is called domain III to
distinguish it from the β-barrel domains I and II. Domain III of the TGEV
3CLpro comprises 103 amino acids and consists of 5 α-helices that adopt
a unique structure that currently has no homologs in the database
(Anand et al. 2002) (Figs. 2 and 3). The structure of the coronavirus
3CLpro domain III differs from the corresponding domain of the
arterivirus nsp4 proteinase, which comprises only 49 residues and consists
of 2 short pairs of β-strands and 2 α-helices (Barrette-Ng et al. 2002).

The differences between picornavirus and coronavirus chymotryp- 
sin-like proteinases also extend to the catalytic residues. Thus, whereas 
the vast majority of picornavirus enzymes employ a catalytic triad, Cys- 
Fig. 2. Sequence comparison of coronavirus 3C-like main proteinases. The 
alignment was generated with the ClustalW program (version 1.82) 
(http://www.ebi.ac.uk/clustalw/) and used as input for the ESPript 
program (version 2.1) 
(http://prodes.toulouse.inra.fr/ESPript/cgi-bin/ESPript.cgi). The 3CLpro 
sequences of transmissible gastroenteritis virus (TGEV, strain Purdue 
46), feline infectious peritonitis virus (FIPV, strain 79-1146), human 
coronavirus 229E (HCoV-229E), porcine epidemic diarrhea virus (PEDV, 
strain CV777), bovine coronavirus (BCoV, isolate LUN), mouse hepatitis 
virus (MHV, strain A59), avian infectious bronchitis virus (IBV, strain 
Beaudette), and SARS coronavirus (SARS-CoV, isolate Frankfurt 1) were 
derived from the replicative polyproteins of the respective viruses 
whose sequences are deposited at the DDBJ/EMBL/GenBank database 
(accession numbers: TGEV, AJ271965; FIPV, AF326575; HCoV, X69721; PEDV, 
AF353511; BCoV, AF391542; MHV, NC_001846; IBV, M95169; SARS-CoV, 
AY291315). The β-strands and α-helices as revealed by the TGEV 3CLpro 
crystal structure (Anand et al. 2002; PDB 1LVO) are shown above the 
sequence alignment. Catalytic Cys and His residues are indicated by 
asterisks 
Fig. 3. Structure of monomer B of TGEV 3CLpro with a hexapeptidyl 
chloromethyl ketone inhibitor bound to the active site (Anand et al. 
2002, 2003). 3CLpro domains I, II, and III are indicated. α-Helices are 
shown in red and are labeled A to E. β-Strands are shown in green and 
are labeled a to f, followed by an indication of the domain to which 
they belong. Shown in ball-and-stick representation are the substrate 
analog inhibitor (residues P1 to P6), the catalytic residues (Cys144 and 
His41), and the S1 subsite His162 residue interacting with Tyr160 and 
the P1 Gln side chain of the substrate (see text for details). N and C 
termini are labeled N and C 


His-Asp(Glu) (Allaire et al. 1994; Matthews et al. 1994; Mosimann et al. 
1997; Seipelt et al. 1999), which is reminiscent of the charge-relay 
system of chymotrypsin-like serine proteinases, the coronavirus 3CLpros 
use a catalytic dyad consisting of Cys (nucleophile) and His (general 
base) (Figs. 2 and 3). Mutation analyses performed with recombinant 
enzymes from different coronavirus species consistently failed to 
identify a third catalytic residue, suggesting that coronavirus 3CLpros 
may lack a counterpart to the catalytic Asp(Glu) of other 
chymotrypsin-like proteinases (Liu and Brown 1995; Lu and Denison 1997; 
Ziebuhr et al. 1997). This hypothesis was confirmed by crystal structure 
analyses of the TGEV (Anand et al. 2002), HCoV-229E (Anand et al. 2003), 
and SARS-CoV 3CLpro enzymes (PDB accession number: 1Q2W). Thus, for 
example, in the TGEV 3CLpro structure, a buried water molecule was found 
in the place that is normally occupied by the third member of the triad 
(Asp or Glu). The water was hydrogen-bonded to His41 Nδ1, His163 Nδ1, 
and Asp186 Oδ1. An equivalent water molecule is also found in the HCoV 
3CLpro structure. Here, it is stabilized by His41 Nδ1, Gln163 Nε2, and 
Asp186 Oδ1. The TGEV 3CLpro structure also suggested that, after the 
attack of the active-site Cys144 nucleophile on the carbonyl carbon of 
the scissile bond, the developing oxyanion is stabilized by hydrogen 
bonds donated by the main chain amides of Gly142, Thr143, and Cys144, 
which together form the “oxyanion hole.” 

The substrate specificity of coronavirus 3CLpros resembles that of 
many other 3C and 3C-like proteinases (Blom et al. 1996; Ryan and Flint 
1997) in so far as all the coronavirus 3CLpro sites share a Gln residue 
at the P1 position, whereas small residues (Ala, Ser, and Gly) are 
conserved at the P1’ position (Ziebuhr et al. 2000). Larger residues, 
such as Asn (which is found at the P1’ position of all coronavirus 
nsp8|nsp9 sites), result in significantly reduced cleavage efficiencies 
(Ziebuhr and Siddell 1999; Hegyi and Ziebuhr 2002; Fan et al. 2003). Leu 
is strongly preferred at the P2 position of coronavirus 3CLpro 
substrates, although other hydrophobic residues, such as Ile, Val, Phe, 
and Met, are occasionally also found at this position. At the P4 
position, small residues, Val, Thr, Ser, Pro, and Ala, are favored. The 
structural basis for the pronounced specificity of coronavirus 3CLpros 
was elucidated recently by structure analysis of a hexapeptidyl 
chloromethyl ketone inhibitor bound to the active site of the TGEV 
3CLpro (Anand et al. 2003). Because the sequence of the inhibitor was 
derived from the P6-P1 region of a natural cleavage site 
(Val-Asn-Ser-Thr-Leu-Gln) of TGEV 3CLpro, the structure most likely 
represents the binding mode of coronavirus 3CLpro substrates in general. 
It was found that the P region of 3CLpro substrates binds in a shallow 
groove at the surface of the proteinase, between domains I and II 
(Fig. 3). Residues P5 to P3 form an antiparallel β-sheet with residues 
164-167 of strand eII and residues 189-191 of the loop linking domains 
II and III. Deletion of the loop region abolishes the proteolytic 
activity of 3CLpro, supporting the functional significance of the 
interaction between the substrate and this loop region (Anand et al. 
2002). 
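
The consensus described above (small residue at P4, no preference at P3, 
Leu or another hydrophobic residue at P2, Gln at P1, and Ala, Ser, or 
Gly at P1’) can be turned into a simple sequence scan. The following is 
only a minimal sketch: the function name, the regular expression, and 
the example sequence (the TGEV-derived hexapeptide Val-Asn-Ser-Thr-Leu-Gln 
embedded in arbitrary flanking residues) are illustrative assumptions, 
and a pattern match says nothing about actual cleavage efficiency. 

```python
import re

# Residue classes taken from the consensus described in the text.
P4_SMALL = "VTSPA"        # small residues favored at P4
P2_HYDROPHOBIC = "LIVFM"  # Leu strongly preferred; others occasional
P1_PRIME_SMALL = "ASG"    # small residues conserved at P1'

# Window P4-P3-P2-P1-P1'; a lookahead is used so that overlapping
# candidate windows are all reported. P3 is unconstrained.
SITE = re.compile(
    rf"(?=([{P4_SMALL}].[{P2_HYDROPHOBIC}]Q[{P1_PRIME_SMALL}]))"
)


def candidate_sites(seq):
    """Return (0-based index of the P1 Gln, P4-P1' window, P2-is-Leu)."""
    hits = []
    for match in SITE.finditer(seq):
        window = match.group(1)
        p1_index = match.start() + 3  # P1 is the fourth residue in the window
        hits.append((p1_index, window, window[2] == "L"))
    return hits


# Hypothetical test sequence: VNSTLQ (from the text) plus invented flanks.
example = "MKVNSTLQSGLRK"
print(candidate_sites(example))
```

Sites with Asn at P1’ (as at nsp8|nsp9) are deliberately not matched by 
this pattern, because the text describes them as inefficiently cleaved; 
a more permissive scan would have to add Asn to the P1’ class. 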

The conserved Gln side chain at the P1 position of 3CLpro substrates 
interacts with the imidazole of His162 (Fig. 3), at the bottom of the S1 
subsite, which is formed by the main-chain atoms of Ile51, Leu164, 
Glu165, and His171 (Anand et al. 2003). The neutral state of His162 over 
a broad pH range appears to be maintained by (1) stacking onto the 
phenyl ring of Phe139 and (2) accepting a hydrogen bond from the 
hydroxyl group of the buried Tyr160. This interpretation is supported by 
mutagenesis data obtained for bacterially expressed HCoV-229E and feline 
infectious peritonitis virus (FIPV) 3CLpros (Ziebuhr et al. 1997; Hegyi 
et al. 2002). Tyr160 is part of the conserved coronavirus 3CLpro 
signature, Tyr-X-His, whereas Gly(Ala)-X-His is found at the equivalent 
sequence position in most 3C and 3C-like proteinases (Gorbalenya et al. 
1989a; Gorbalenya and Snijder 1996). Accordingly, in the latter enzymes, 
stabilization of histidine in the neutral tautomeric state needs to be 
ensured by other residues (Bergmann et al. 1997; Mosimann et al. 1997). 

Note: Amino acid residues of coronavirus 3CLpros are numbered from 
Ser(Ala)1 to Gln302. 

The hydrophobic S2 subsite of the proteinase, which accommodates 
the conserved Leu residue and, in a few cases, other hydrophobic 
residues, is formed by the side chains of Leu164, Ile51, Thr47, His41, 
and Tyr53 (Anand et al. 2003). The fact that, in the structure, the P3 
side chain of the substrate analog was oriented toward bulk solvent 
explains why there is no specificity for any particular side chain at 
the P3 position of coronavirus 3CLpro cleavage sites (Ziebuhr et al. 
2000). The S4 site is rather congested (Anand et al. 2003), explaining 
the conservation of small residues, such as Ser, Thr, Val, or Pro, at 
this position of coronavirus 3CLpro substrates. On the basis of the TGEV 
3CLpro-inhibitor structure, it has been proposed that the relatively 
small P1’ residues (Ser, Ala, or Gly) may be accommodated by an S1’ 
subsite that involves Leu27, His41, and Thr47 (Anand et al. 2003). 

It is generally believed that most of the pp1a/pp1ab cleavages are 
mediated in trans by the fully processed form of 3CLpro (nsp5). The 
trans activity of 3CLpro has been well characterized, both biochemically 
and structurally (Ziebuhr et al. 1995; Grötzinger et al. 1996; Lu et al. 
1996; Heusipp et al. 1997a,b; Tibbles et al. 1999; Ziebuhr and Siddell 
1999; Anand et al. 2002, 2003; Hegyi and Ziebuhr 2002; Fan et al. 2003). 
However, it is not clear whether 3CLpro cleaves itself from pp1a/pp1ab 
in cis or in trans. Also, it is not clear whether 3CLpro can cleave 
downstream pp1a/pp1ab sites in cis. Thus, on the one hand, there is 
biochemical and structural evidence to suggest that 3CLpro 
self-processing occurs in trans (Lu et al. 1996; Anand et al. 2002). 
Furthermore, in MHV-infected cells, 3CLpro was found to be part of a 
rather stable 150-kDa processing intermediate (nsp4-10 or nsp4-11), 
which also argues against a rapid, cotranslational release of 3CLpro in 
cis (Schiller et al. 1998). On the other hand, a number of MHV and IBV 
3CLpro-containing precursors were shown to require microsomal membranes 
for efficient autocatalytic release of 3CLpro from the flanking TM2 
(nsp4) and TM3 (nsp6) domains (Tibbles et al. 1996; Piñón et al. 1997), 
indicating that the flanking domains (when properly folded) affect the 
activity of 3CLpro. In other words, interdomain interactions in pp1ab 
may modulate the structure (and activity) of the enzyme, for example, to 
render 3CLpro competent for cis cleavages at flanking sites or even 
further downstream sites. In fact, one might expect that at least some 
of the pp1a/pp1ab cleavages need to occur in cis early in infection, 
when the concentration of 3CLpro is low and intermolecular reactions are 
less likely to occur. Otherwise, if there were no cis cleavages at all, 
pp1a/pp1ab should operate initially as an extremely large polyprotein 
that is only processed at its N-terminus by PLpro cleavages. Structure 
information for larger 3CLpro precursors will be required to answer the 
question of whether or not 3CLpro adopts alternative conformations in 
its fully processed form and larger precursor molecules. Notably, 
reorientation of secondary structure elements after intramolecular 
release is believed to occur in picornavirus 3C proteinases (Khan et al. 
1999), illustrating the significance of this question. At present, 
structure information is only available for the fully processed 
coronavirus 3CLpro (Anand et al. 2002, 2003). Both the crystal 
structures and dynamic light scattering data show that 3CLpro forms 
dimers (Anand et al. 2002, 2003). The two molecules in the dimer are 
oriented perpendicular to one another (Fig. 4). The contact interface 
mainly involves conserved residues of the N-terminus of one molecule and 
domain II of the other molecule (and vice versa). The N-terminal amino 
acid residues are squeezed in between domains II and III of the parent 


Fig. 4. Coronavirus main proteinases form dimers (Anand et al. 2002). 
Stereo representation of a Cα plot of a TGEV 3CLpro dimer (PDB accession 
number: 1LVO). Monomers A and B are shown in blue and red, respectively. 
The monomers are oriented perpendicular to one another. Dimerization 
mainly involves interactions of the N terminus of one monomer with 
domain II of the other monomer (see text for details). The N termini of 
monomers A and B are shown in green and brown, respectively 
monomer and domain II of the other monomer, where they make a 
number of very specific interactions that appear tailor-made to bind 
this segment with high affinity. Apparently, this mechanism allows the 
active site to remain competent for binding and cleaving other sites in 
the polyprotein after autocleavage of 3CLpro. In addition, the exact 
placement of the N-terminus seems to have a structural role for the 
mature 3CLpro, because deletion of residues 1 to 5 leads to a dramatic 
decrease in proteolytic activity (Anand et al. 2003). It has been 
speculated that the tight interaction of the N-terminus with domains II 
and III may help to maintain the loop connecting domains II and III in 
the orientation required to bind the P3-P5 residues of the substrate 
(Anand et al. 2002, 2003). The presumed indirect role of domain III in 
proteolysis may explain the results from previous mutagenesis studies 
that consistently reported a dramatic loss of trans-cleavage activity 
with C-terminally truncated forms of HCoV-229E, TGEV, MHV, and IBV 
3CLpros (Lu and Denison 1997; Ziebuhr et al. 1997; Ng and Liu 2000; 
Anand et al. 2002). 

Genetic data also point to a (direct or indirect) role of domain III in 
RNA synthesis. Thus characterization of temperature-sensitive (ts) MHV 
mutants revealed that substitution of the MHV 3CLpro Phe219 residue, 
which is part of the loop connecting α-helices B and C in domain III 
(Fig. 2), with Leu causes an RNA-minus phenotype at the restrictive 
temperature (Siddell et al. 2001). Further characterization of the ts 
mutant, Alb ts16, showed that both plus- and minus-strand synthesis was 
not greatly affected when the temperature was shifted late in infection. 
However, when the temperature was shifted to the nonpermissive 
temperature early, at a time when the rate of MHV RNA synthesis 
increases rapidly, no increase of plus-strand synthesis was observed 
with Alb ts16. Furthermore, inhibition of minus-strand synthesis (by 
inhibition of protein synthesis) was found to cause a decline of 
plus-strand synthesis after 30-60 min. The data can be interpreted to 
indicate that the defect in 3CLpro activity interferes with minus-strand 
synthesis and reduces it to a low level that merely ensures the 
replenishment of minus strands being lost because of turnover. 
Alternatively, the mutation may cause a defect in the activity of 3CLpro 
that blocks the formation of plus-strand polymerase activity (or 
prevents its conversion from the minus strand-synthesizing precursor). 
It remains to be determined whether the observed ts phenotype is caused 
by specific defects in the proteolytic activity of 3CLpro or whether 
another, nonproteolytic function of domain III is affected. Thus, for 
example, protein-protein interactions involving domain III (as proposed 
to be mediated by the C-terminal domain of the EAV nsp4 proteinase; 
Barrette-Ng et al. 2002) may be affected. 
Fig. 5. Differential orientation of the C-terminal domains III of TGEV 
and SARS-CoV 3C-like main proteinases (PDB 1LVO and 1Q2W). 
Superimposition (stereo image) of TGEV (orange) and SARS-CoV (blue) 
3CLpros shows little variation between the structures of the N-terminal 
β-barrel domains I and II. The orientation (rather than the structure) 
of the respective C-terminal domains of TGEV and SARS-CoV 3CLpro differs 
slightly in the two proteins, resulting in less perfect superimposition 


Comparison of coronavirus main proteinase structures shows that 
domains I and II superimpose much better than the C-terminal domains 
III (Fig. 5). This is mainly due to a slightly different orientation of do- 
main III in relation to domains I and II rather than differences in the 
domain III structures themselves. 


3.3 
Helicase 


RNA helicases represent the second most conserved subunit of the 
RNA synthesis machinery of +RNA viruses and are involved in diverse 
steps of the viral life cycle (Buck 1996; Kadaré and Haenni 1997). They 
utilize the energy derived from hydrolysis of nucleoside triphosphates 
(NTPs) to unwind double-stranded (ds) RNA. Conservation of specific 
sequence motifs allows helicases to be classified into three large super- 
families (SFs), termed SF1, SF2, and SF3, as well as several small families 
(Gorbalenya et al. 1989b; Gorbalenya and Koonin 1993). The coronavirus 
helicase resides in nsp13 and has been classified as belonging to SF1 
(Gorbalenya et al. 1989b, c) (Fig. 1). Nsp13 and its homologs in other 
nidoviruses have a putative zinc-binding domain (ZBD) at their N-ter- 
minus (Gorbalenya et al. 1989c), which is known to be required for the 
enzymatic activities of coronavirus and arterivirus helicases (Seybert, 
van Dinten, Posthuma, Snijder, Gorbalenya, and Ziebuhr, unpublished 
data). EAV reverse genetics data have shown that the ZBD and a down- 
stream segment (“hinge spacer”) that links ZBD to the C-terminal heli- 
case domain have distinct functions in arterivirus replication, sg mRNA 
transcription, and virion morphogenesis (van Dinten et al. 2000). It is 
tempting to suggest that coronavirus helicases may have similarly di- 
verse functions. Biochemical characterization of a recombinant form of 
HCoV-229E nsp13 demonstrated both nucleic acid-stimulated NTPase 
and duplex-unwinding activities (Seybert et al. 2000a). Similar data have 
subsequently been obtained for two arterivirus nsp10 helicases and the 
SARS-CoV nsp13 helicase (Seybert et al. 2000b; Bautista et al. 2002; Tan- 
ner et al. 2003; Thiel et al. 2003). 

Coronavirus (and arterivirus) helicases were shown to unwind their 
dsRNA substrates with 5’-to-3’ polarity, that is, they move in a 5’-to-3’ 
direction on the strand to which they initially bind (Seybert et al. 2000a, 
b). Obviously, this stands in contrast to the 3’-to-5’ polarity of the SF2 
helicases of flavi-, pesti-, and hepaciviruses (Kadaré and Haenni 1997; 
Kwong et al. 2000) and may indicate fundamental differences in biologi- 
cal functions between the two groups of enzymes. For example, the 
5’-to-3' polarity of the coronavirus nsp13 helicase activity argues against 
a role in the separation of secondary structures in the RNA template dur- 
ing minus-strand synthesis (as has been suggested for RNA viral SF2 he- 
licases), because this would require a helicase with 3’-to-5’ polarity. 
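
The polarity argument can be made concrete with a toy rule for 
partial-duplex substrates. The sketch below rests on the general 
assumption (not spelled out in this chapter) that a helicase loads onto 
a single-stranded tail and then translocates along that strand, so a 
5’-to-3’ enzyme needs a 5’ tail and a 3’-to-5’ enzyme needs a 3’ tail to 
run into the duplex. The function name and the string labels are 
hypothetical. 

```python
# Toy model: which single-stranded tail lets a helicase of a given
# polarity enter the duplex region of a partial-duplex substrate?
# Assumption: the enzyme binds the tail and moves along that strand.
REQUIRED_TAIL = {
    "5'-to-3'": "5'",  # e.g., polarity reported for coronavirus nsp13
    "3'-to-5'": "3'",  # e.g., polarity of flavi-, pesti-, hepacivirus SF2 helicases
}


def can_unwind(polarity, tail_end):
    """Return True if a helicase of this polarity can load on the given tail."""
    if polarity not in REQUIRED_TAIL:
        raise ValueError(f"unknown polarity: {polarity!r}")
    if tail_end not in ("5'", "3'"):
        raise ValueError(f"unknown tail end: {tail_end!r}")
    return REQUIRED_TAIL[polarity] == tail_end


for polarity in REQUIRED_TAIL:
    for tail in ("5'", "3'"):
        print(polarity, tail, can_unwind(polarity, tail))
```

In this simplified picture, a 5’-to-3’ helicase loaded on a template 
moves opposite to the direction in which a polymerase reads that 
template (3’ to 5’), which is the basis of the argument made above. 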

Interestingly, coronavirus nsp13 is one of the few helicases that have 
no marked preference for RNA or DNA substrates. Thus they have been 
found to unwind partial-duplex DNA substrates with high efficacy 
(Seybert et al. 2000; Thiel et al. 2003). This property allows DNA-based 
assays to be used in the characterization of coronavirus helicases (for 
example, in mutagenesis studies and high-throughput tests of potential 
inhibitors). Because coronaviruses replicate in the cytoplasm and the he- 
licase has not been found to localize to the nucleus (Sims et al. 2000; 
Bost et al. 2001), a biological significance of the DNA-unwinding activity 
of nsp13 seems unlikely, although it cannot be excluded entirely at the 
present stage. It should be mentioned in this context that the hepatitis C 
virus (HCV) NS3 helicase also has DNA duplex-unwinding activity, 
which, however, has been proposed to affect the structure of host cell 
DNA (Pang et al. 2002). 

Duplex unwinding by coronavirus helicases is an energy-dependent 
process that derives its energy from NTP hydrolysis (Seybert et al. 
2000a; Seybert and Ziebuhr 2001). Coronavirus helicases appear to be 
highly promiscuous with respect to the NTP cofactor used. Thus all 
standard NTPs and dNTPs were found to be hydrolyzed by coronavirus 
helicases (Seybert et al. 2000a; Seybert and Ziebuhr 2001; Tanner et al. 
2003). Finally, coronavirus helicases possess RNA 5’-triphosphatase ac- 
tivity that may be involved in the formation of the 5’ RNA cap structure 
of coronavirus plus-strand RNAs (Ivanov et al. 2004; Ivanov and Ziebuhr 
2004). 


3.4 
RNA-Dependent RNA Polymerase 


As discussed above for other coronavirus pp1a/pp1ab proteins, the 
RdRp domain also differs substantially from its homologs in other 
+RNA viruses. Coronavirus RdRps and their nidovirus relatives have 
been classified as an outgroup of SF1 RdRps (Koonin 1991). The 
coronavirus RdRp domain comprising the finger, palm, and thumb 
subdomains occupies the C-terminal two-thirds of nsp12 (Gorbalenya et 
al. 1989c). Recent data suggest that replication complex association of 
the RdRp may occur through interactions of the nsp12 segment 411-448 
(located upstream of the RdRp core domain in nsp12) with ORF1a-encoded 
proteins, such as nsp5 (3CLpro), nsp8, and nsp9 (Brockway et al. 2003). 
Consistent with the presumed RdRp activity of nsp12, a mutation in nsp12 
(His868 to Arg) was found to cause an RNA-negative phenotype in an MHV 
ts mutant, Alb ts22 (Siddell et al. 2001). Thus, when infected cultures 
of Alb ts22 were shifted to the restrictive temperature of 40°C, both 
plus- and minus-strand RNA synthesis ceased immediately. Even at the 
permissive temperature, the ts mutant synthesized 4-5 times less RNA 
compared with revertants. The defect of this mutant in RNA synthesis can 
easily be explained by the fact that His868 is part of the predicted 
thumb subdomain of the MHV RdRp that, in other RNA polymerases, has been 
implicated in polymerase activity (Burns et al. 1989; Mills et al. 1989; 
Plotch et al. 1989; Hansen et al. 1997). 

The Cys/His-rich nsp10 that immediately precedes RdRp in pp1ab 
(Fig. 1) has also been implicated in RNA synthesis. An MHV ts mutant, 
Alb ts6, encoding a mutant form of nsp10 (Gln65 to Glu), was shown to 
have a defect in minus-strand RNA synthesis (Siddell et al. 2001). Thus, 
when the temperature was shifted to 40°C, minus-strand synthesis 
stopped immediately but plus-strand synthesis continued at the same 
level as was occurring at the time of temperature shift. Plus-strand RNA 
synthesis gradually declined over 3-4 h (starting at 30-60 min after the 
shift to 40°C) because the minus strands produced at the permissive 
temperature were turned over (Wang and Sawicki 2001) and, because of 
the defect in their synthesis, were not replenished at the restrictive 
temperature. 

Nsp10 and nsp12 (RdRp) are adjacent domains in pp1ab (Fig. 1). 
Peptide cleavage data have shown that, most likely because of a 
replacement of the conserved P2 Leu residue, the nsp10|nsp12 cleavage 
site is less efficiently cleaved than other SARS-CoV 3CLpro sites (Fan 
et al. 2003). Also, the nsp10|nsp12 sites of other coronaviruses have 
the P2 position occupied by noncanonical residues. It is thus tempting 
to speculate that the nsp10|nsp12 site has to be cleaved more slowly 
than other sites, probably to attain a specific activity mediated by an 
nsp10-nsp12-containing intermediate. The IBV nsp10 has been reported to 
form dimers. It localizes to membranes near the site of viral RNA 
synthesis (Ng and Liu 2002). 


4 
Subcellular Localization of the Coronavirus Replicase 


Genome replication and transcription of virtually all +RNA viruses take 
place at intracellular membranes that are derived from various cellular 
organelles including, for example, the endoplasmic reticulum, lysosomes 
and endosomes, intermediate compartment and trans-Golgi network, 
peroxisomes, mitochondria, and chloroplasts (Russo et al. 1983; 
Froshauer et al. 1988; Peränen and Kääriäinen 1991; De Graaff et al. 
1993; Peränen et al. 1995; Restrepo-Hartwig and Ahlquist 1996; Schaad 
et al. 1997; van der Meer et al. 1998; Mackenzie et al. 1999; 
Restrepo-Hartwig and Ahlquist 1999; Miller et al. 2001). The viral 
replication complex, which consists of multiple viral but also cellular 
subunits (see the chapter by Shi and Lai, this volume), is associated 
with these membranes and, in many cases, also directs their synthesis 
and/or modification (Peränen and Kääriäinen 1991; Cho et al. 1994; 
Schlegel et al. 1996; Teterina et al. 1997; Snijder et al. 2001; Egger 
et al. 2002). Typically, multiple vesicles or membrane invaginations 
(spherules) on cellular organelles are induced to which the replication 
complex is attached by specific structural elements, such as hydrophobic 
domains (van Kuppeveld et al. 1995; Snijder et al. 2001), amphipathic 
helices (Datta and Dasgupta 1994), palmitate side chains (Laakkonen et 
al. 1996), and C-terminal membrane insertion sequences (Schmidt-Mende et 
al. 2001). As a result, replication takes place in a membrane-protected 
(and, thus, nuclease-resistant) microenvironment that contains (and 
sequesters) the protein functions required for viral RNA synthesis. This 
strategy is believed to improve template specificity by retaining 
negative strands for template use and to repress host defenses that may 
be induced by double-stranded RNA (Schwartz et al. 2002). 

Association of the viral replication/transcription complex with intra- 
cellular membranes has also been established for coronaviruses (Sethna 
and Brian 1997). Thus TGEV genome- and subgenome-length minus 
strands, which are the templates for viral genome RNA replication 
and subgenomic mRNA transcription, respectively (Sethna et al. 1989; 
Sawicki and Sawicki 1990; Schaad and Baric 1994; Sawicki et al. 2001), 
were predominantly found in nuclease-resistant membranous complex- 
es. In contrast, positive-strand RNAs proved to be much more suscepti- 
ble to nuclease digestion, indicating that plus-strand RNAs, which also 
act as mRNAs, are mainly in solution or part of easily dissociable com- 
plexes in the cytosol (Sethna and Brian 1997). 

Immunofluorescence (IF) studies provided clear evidence that the 
vast majority of coronavirus replicase subunits localize to perinuclear 
membrane compartments (Heusipp et al. 1997a; Bi et al. 1998; Denison 
et al. 1999; Shi et al. 1999; van der Meer et al. 1999; Ziebuhr and 
Siddell 1999; Bost et al. 2000; Sims et al. 2000; Bost et al. 2001; Xu 
et al. 2001; Ng and Liu 2002). Whereas most ORF1a-encoded replicase 
components remain tightly associated with membranes throughout the viral 
life cycle, at least some of the ORF1b-encoded subunits seem to be only 
temporarily present in the complex, probably when still part of the 
polyprotein. Thus, for example, partial detachment from the 
membrane-bound complexes was reported for MHV nsp12 and nsp13 later in 
infection (van der Meer et al. 1999; Bost et al. 2001; Xu et al. 2001). 
Also, the most C-terminal IBV pp1ab processing products show, in 
contrast to all other IBV pp1a/pp1ab proteins tested, a diffuse, 
cytoplasmic staining pattern in IF experiments (van der Meer et al. 
1999; Bost et al. 2001; Xu et al. 2001). The membrane-bound replicase 
proteins overlap to a large extent with the site of viral RNA synthesis 
(Denison et al. 1999; Shi et al. 1999; van der Meer et al. 1999; Bost et 
al. 2001; Gosert et al. 2002; Ng and Liu 2002). There is some 
controversy regarding the intracellular compartment at which viral RNA 
synthesis takes place and, in particular, the cellular origin of the 
membranes employed. In a recent EM study (Gosert et al. 2002), 
virus-induced double membrane vesicles (DMVs) were reported to be the 
site of MHV-A59 replication and transcription in HeLa-MHVR (Gallagher 
1996) and 17CL-1 cells. These DMVs have a diameter of 200-350 nm and 
consist of a double membrane that, occasionally, is fused into a 
trilayer. At the time of maximum RNA synthesis, both genome- and 
subgenome-length positive-strand RNA was detected on DMVs by in situ 
hybridization, and the results of BrUTP labeling also suggest that DMVs 
are the site of viral RNA synthesis. The subcellular origin of the DMVs 
has not been determined to date. However, a previous IF study (Shi et 
al. 1999) using MHV-A59-infected 17CL-1 and HeLa-MHVR cells suggested 
that N-terminal pp1a/pp1ab proteins and newly synthesized RNA colocalize 
with ER- or Golgi-derived membranes, depending on the cell type studied. 

In clear contrast to these results, another study revealed that, in 
MHV-A59-infected L cells at 5 h p.i., the C-terminal pp1a region (CT1a), 
3CLpro (nsp5), RdRp (nsp12), helicase (nsp13), and the N protein are 
associated with virus-induced, late endosomal/lysosomal membranes, 
which were confirmed to be the site of RNA synthesis (van der Meer et 
al. 1999). In IF experiments, the sites of maximum CT1a accumulation 
overlapped only partially with those of nsp5, nsp12, and nsp13. A 
thorough EM study suggested that the low (albeit significant) degree of 
colocalization of CT1a and nsp12 is probably due to the existence of two 
distinct types of membrane structures that are closely adjacent to each 
other but have different morphologies and protein compositions. Thus 
CT1a was found to be associated mainly with endosomes, whereas the 
majority of nsp12 was associated with multilayered membranes, probably 
originating from invaginations on continuous membrane sheets. The 
latter structures were morphologically reminiscent of endocytic carrier 
vesicles (ECVs) or multivesicular bodies (MVBs). However, the fact that 
many of these structures had membrane continuities to late endosomes 
argues against typical ECVs and rather favors the idea that both the 
multivesicular (carrying the bulk of CT1a) and multilayered (carrying 
the bulk of nsp12) structures represent different subdomains of the same 
endocytic compartment. Most intriguingly, it has also been found (van 
der Meer et al. 1999) that CT1a- and nsp12-positive membranes appear 
to be secreted. Similar observations have also been reported recently 
for endosome-derived cytoplasmic vacuoles carrying the alphavirus 
replication complex (Kujala et al. 2001). The functional significance of 
this phenomenon is currently unclear but may have parallels in the 
regulated lysosomal secretion systems employed by, for example, 
lymphocytes (Stinchcombe and Griffiths 1999). 
The existence of two closely associated but physically distinct 
membrane compartments was also shown by iodixanol gradient 
centrifugation of intracellular membranes isolated from MHV-A59-infected 
DBT cells (Sims et al. 2000). The ORF1a-encoded proteins nsp2 (p65) and 
nsp8 (p22) cofractionated with membranes with a buoyant density of 
1.05-1.09 g/ml. In contrast, nsp13, the N protein, nsp1 (p28), and newly 
synthesized RNA were detected in another membrane fraction of 1.12-1.13 
g/ml. Both membrane fractions were LAMP-1 positive, confirming 
previous conclusions on the endosomal/lysosomal origin of the MHV 
replication compartment. Interestingly, later in infection, there appears 
to be a translocation of nsp13 and the N protein to the ER/cis-Golgi 
compartment, resulting in colocalization of these two proteins with the 
M protein at the site of virion assembly (Bost et al. 2001). The combined 
data suggest a multipartite structure of the coronavirus replication 
complex, with the N protein playing a specific role in RNA synthesis as 
suggested earlier (Compton et al. 1987; Baric et al. 1988). Apparently, 
the coronavirus replication complex undergoes structural rearrangements 
at the transition from maximum RNA synthesis to virion assembly at later 
time points (8-12 h p.i.). If this is confirmed, the localization of 
nsp13 at the site of assembly may correspond with a specific role of 
nsp13 in virion biogenesis. Such an activity has also been proposed for 
the related arterivirus nsp10 helicase (van Dinten et al. 1999, 2000; 
Seybert et al. 2000b). 
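
The two iodixanol gradient fractions described above can be summarized 
as a small lookup, which may help when comparing density measurements 
against the reported ranges. This is only a bookkeeping sketch: the 
fraction labels "light" and "dense" and the function name are 
assumptions introduced here, and only the density ranges and the 
components listed come from the text (Sims et al. 2000). 

```python
# Reported buoyant density ranges (g/ml) and cofractionating components.
# The labels "light" and "dense" are illustrative, not from the source.
FRACTIONS = {
    "light": {
        "density_range": (1.05, 1.09),
        "components": ["nsp2 (p65)", "nsp8 (p22)"],
    },
    "dense": {
        "density_range": (1.12, 1.13),
        "components": [
            "nsp13",
            "N protein",
            "nsp1 (p28)",
            "newly synthesized RNA",
        ],
    },
}


def fraction_for_density(density):
    """Return the label of the fraction whose range contains the density."""
    for label, info in FRACTIONS.items():
        low, high = info["density_range"]
        if low <= density <= high:
            return label
    return None  # density falls outside both reported ranges


for d in (1.07, 1.10, 1.125):
    label = fraction_for_density(d)
    members = FRACTIONS[label]["components"] if label else []
    print(d, label, members)
```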

To date, the mechanisms by which components of the coronavirus 
replication complex are integrated in or attached to intracellular mem- 
branes have not been elucidated in detail. However, it seems very likely 
that the strongly hydrophobic domains, TM1 to TM3 (see Fig. 1), that 
are present in nsp3, nsp4, and nsp6 (Gorbalenya et al. 1989c; Ziebuhr et 
al. 2001) play a major role in this process. This hypothesis is supported 
by arterivirus data showing that homologous hydrophobic domains 
present in EAV nsp2 and nsp3 are necessary and sufficient to trigger the 
synthesis of the membrane structures carrying the arterivirus replication
complex (Pedersen et al. 1999; Snijder et al. 2001). The fact that several
MHV pp1a/pp1ab processing products including nsp3 (Gosert et al.
2002) and nsp4-10(11) (Schiller et al. 1998), which contain TM1 and 
TM2/TM3, respectively, are integral membrane proteins strongly suggests
a scaffold function for these proteins. There is also biochemical evidence
indicating that the majority of ORF1a-encoded proteins and, to a
lesser extent, ORF1b-encoded proteins are tightly bound in the complex 
(Gosert et al. 2002). The precise protein-protein and protein-RNA inter- 
actions stabilizing this complex remain to be characterized. 

5
Concluding Remarks 


Although much has been learned about coronavirus replicase organiza- 
tion, localization, proteolytic processing, and some of the viral replica- 
tive enzymes (e.g., proteinases and helicases), there are still major gaps 
in our knowledge. Given the availability of full-length clones of coron- 
aviruses, directed genetic analysis is now possible (Almazan et al. 2000; 
Yount et al. 2000; Casais et al. 2001; Thiel et al. 2001a; Yount et al. 2002, 
2003). In vivo studies, together with biochemical and structural analyses,
should yield important new information on the molecular details of
coronaviral RNA synthesis. In this context, it will be of particular inter- 
est to define the proteins that are responsible for the unique features of 
coronavirus RNA synthesis, for example, the production of an extensive 
set of 5’- and 3’-coterminal subgenomic RNAs and the synthesis and 
maintenance of RNA genomes of this unique size. Studies on coronavirus
replicases and their homologs in closely related viruses may also
help to determine the structural and functional constraints that have 
driven the evolution of nidoviruses and enable them to infect a broad 
range of vertebrate and invertebrate hosts. Furthermore, the relationship 
of the recently identified coronavirus RNA processing activities with cel- 
lular proteins may reveal interesting insights into similarities and differ- 
ences (or even an interplay) between coronaviral and cellular RNA me- 
tabolism pathways. In the long term, the unique structural properties of 
coronavirus replicative enzymes may allow the development of very se- 
lective enzyme inhibitors and possibly even drugs suitable to combat co- 
ronavirus infections. 